DNA testing companies struggle to provide a broad, simplified interface to what is a complex and constantly evolving target. While testing companies vary in the value they provide beyond the lab test service, all allow the export of the raw results. So "citizen scientists" (as they like to call themselves) work to develop bleeding-edge technology and tools to address the limitations. This has opened up a whole host of third-party tools geared at helping you get more out of your test results. They take the microarray test data exported from the testing companies and import it for processing and analysis.
Note: As of late May 2020, ALL third-party tools received "cease and desist" letters from Ancestry, so virtually no third-party tool can extract DNA match data from Ancestry any longer. This includes browser add-ins as well as clustering tools that gathered the data from Ancestry for you. This is a major blow to the community's ability to make use of the DNA match data in the Ancestry database. Together with Ancestry not providing matching segment data, it really hampers any useful genetic genealogy research using the site. Let's hope Ancestry has done this because they plan on coming out with their own tools in this area shortly, but we suspect not. It is yet another reason to drop Ancestry from genetic genealogy, even though they have the largest match database and the most extensive records library (outside maybe FamilySearch).
Autosomal Analysis Tools, Sites and Apps
Site | Notes |
GEDMatch | Initially grown out of the sister surname project RogersDNA to auto-compare GEDCOM files, this site is the pre-eminent place to compare autosomal DNA results independent of the testing company. With well over 500,000 uploaded results, it is also a highly collaborative site, filled with people who want to explore the connection (as opposed to most testing company websites, where the common complaint is that most matches are not interested in exploring whether there is a common connection). Their extended Tier 1 tools, available for a fee, are the best way to do deep segment analysis without requiring access to the DNA source files from each match (which desktop tools tend to require). |
Genealogical DNA Analysis Tool | (For many years known as GenomeMate Pro or GMP; now GDAT.) A newly established desktop tool by Becky Walker for managing match lists and doing matching segment analysis (if the test site supports segments). It is the "expert's tool" that has become extremely popular: it has extensive features and many options, but is difficult to master and use effectively. See Wes Johnson's notes on a first-time experience (with the original GMP). It is best to use the DNAGedCom app/client (shown below) and the Pedigree Thief Chrome extension to import the segment match and tree data from other sources. (Download was previously located at http://genomate.org/ then moved to http://getgmp.com/). |
DNA Kit Studio | And other tools by Wilhelm Halys-O. Reads most autosomal microarray file formats and provides some manipulation and analysis. Most importantly, it has a merge feature that creates a "super-kit" for those tested on multiple sites. (Formerly at Wilhelm Genealogy.) |
Pedigree Thief | OK, not directly DNA related, but used by many to capture pedigrees from Ancestry and MyHeritage matches for use in tools like GenomeMate Pro. After all, you need a pedigree with a match to figure out the overlap. Built as a Chrome extension, as opposed to a stand-alone desktop app. It has since been expanded to capture match lists and matching segment data as well. Provides a better mechanism for MyHeritage capture than DNAGEDCom, for example. |
Shared Clustering | Jonathan Brecher's tool to create cluster maps. Free; Windows 10 and Ancestry only. |
AutoClusters | From GeneticAffairs. Now adopted in and provided by MyHeritage and GEDMatch Tier 1. |
DNAGedcom | Rob Warthen's online upload site and wrapper for running other tools on his server, such as Don Worth's Autosomal DNA Segment Analyzer (ADSA) and Kitty Cooper's Kworks. There is a new paid desktop client and a Chrome extension for some additional features; the desktop tool now includes a cluster tool. Overview information is in their PDF: "What you can do with DNAGEDCom?". Being replaced by / merged with the new development Genetic.Family. |
Genetic.Family | A brand-new, still-in-development site / server to manage match lists and do segment match analysis. Think of GenomeMate Pro, but online in the cloud rather than on the desktop. It does not analyze the microarray SNP file formats like GEDMatch; instead it works from the matching segments (if available), match lists, and genealogical tree data made available by the various test companies. Meant to replace DNAGEDCom going forward, once fully functional. A key point is that it stores uploaded data in the cloud and does not allow it to be subsequently removed. This matters because the match list information of your shared matches is snared as part of the grab and upload, so you may be unwittingly disclosing and permanently transferring your matches' information to their central DB. |
Double Match Triangulator | Louis Kessler's entry into the tools market; the most recent entry in the burgeoning field of cluster analysis tools. |
AncestryDNA Helper | Snavely's Chrome extension to help you extract information out of Ancestry for use with the above tools, especially DNAGEDCom and GenomeMate Pro. See the help file developed by Barbara Taylor. |
DNA Match Manager | Heirloom Software's tool to download match files from multiple companies into CSVs in a common format. Similar to the paid DNAGEDCom tools but with fewer features. Many claim it is 10x+ faster, but it may not be digging as deep for the harder-to-acquire data, especially triangulated segments. |
Kitty Cooper's Utilities | A number of utilities by Kitty Cooper for visualizing your matching |
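The "super-kit" merge idea mentioned for DNA Kit Studio above can be illustrated with a minimal sketch. This is not that tool's actual algorithm. It assumes both kits are already in a 23andMe-style tab-separated layout (rsid, chromosome, position, genotype, with `#` comment lines). The merge rule (keep the first kit's call unless it is a no-call, and list disagreements for review) and the names `read_kit` and `merge_kits` are our own assumptions for illustration.

```python
import io

# Assumed no-call markers; real files vary by testing company.
NO_CALLS = {"--", "00", ""}

def read_kit(handle):
    """Parse a 23andMe-style raw data stream into {rsid: (chrom, pos, genotype)}."""
    calls = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        if len(fields) < 4 or fields[0] == "rsid":
            continue  # skip short lines and an uncommented header row
        rsid, chrom, pos, genotype = fields[:4]
        calls[rsid] = (chrom, int(pos), genotype)
    return calls

def merge_kits(primary, secondary):
    """Union two kits by rsid; secondary fills gaps and no-calls in primary."""
    merged = dict(primary)
    conflicts = []
    for rsid, (chrom, pos, genotype) in secondary.items():
        if rsid not in merged or merged[rsid][2] in NO_CALLS:
            merged[rsid] = (chrom, pos, genotype)
        elif genotype not in NO_CALLS and sorted(genotype) != sorted(merged[rsid][2]):
            conflicts.append(rsid)  # both kits called it, but differently
    return merged, conflicts

# Made-up sample data, for illustration only.
kit_a = read_kit(io.StringIO("# sample\nrs1\t1\t100\tAG\nrs2\t1\t200\t--\nrs3\t2\t300\tCC\n"))
kit_b = read_kit(io.StringIO("rs2\t1\t200\tTT\nrs3\t2\t300\tCT\nrs4\tX\t400\tA\n"))
merged, conflicts = merge_kits(kit_a, kit_b)
print(len(merged), conflicts)  # 4 ['rs3']
```

Comparing sorted genotypes treats `AG` and `GA` as the same call, since allele order differs between companies. A real merge would also need to reconcile genome builds and strand orientation, which this sketch ignores.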
Y DNA Analysis Tools
Site | Notes |
Cladefinder | Hunter Provyn's Cladefinder, developed under funding with Thomas Krahn's ySeq. Uses the latest yFull tree model and SNP list. Takes in microarray as well as WGS VCF files (which must be subsetted to Y only, for size). Provides the deepest quick measurement using the latest tree info. Hunter has a number of other tools for visualizing data from the yFull tree. |
Morley Y DNA Analysis | By the sister surname project MorleyDNA. An ad-hoc tool that can extract the yDNA SNP results from Ancestry, 23andMe and even NGG and map them onto the ISOGG tree or his experimental phylogenetic tree. It has not been updated in many years, so it lacks the latest SNPs and may not get as deep as the test results contain; but "good enough for horseshoes" as they say. Since microarray test results only go as deep as the trees did at the time the tests were developed, it is still useful. Definitely the easiest, quickest method to see the haplogroup results and get a predicted "deepest" branch in a phylogenetic tree. |
NevGen | Offers a refined prediction of the haplogroup based only on STR test results. Often much, much deeper than provided by FTDNA, which only utilizes the y12 result to predict a major (ancient) haplogroup. Use the online tool to get the refined subclades option. The more STR values tested, the better the prediction. Provides a confidence factor with the prediction. Developed by scraping FTDNA projects for STR results and the BigY tester haplogroups listed there. Often can get deeper than STR match list members and their BigY results. |
yFull | A third-party site from the University of Moscow that takes in your sequencing yDNA results and extracts known SNP, STR, and other markers. There is a charge but most feel it is worth it. If you do BigY with FTDNA, you want to use this site. (Some express concern with paying by credit card and uploading to a site in Russia.) yFull has a deeper, more active haplogroup tree development than ISOGG and often even FTDNA. |
MitoYDNA.org | In early use. Aimed as a replacement for ySearch and MitoSearch, which closed down in May 2018. Provides a yDNA STR match service / public database and an mtDNA SNP upload and match service. It should be noted that yFull has been expanding to include this as part of their suite of capabilities, but yFull mostly focuses on sequencing result uploads and not microarray test results like MitoYDNA. |
McGee Tools | The sister surname project McGee has some tools (now JavaScript based) that calculate the genetic distance and the predicted time to a most recent common ancestor (TMRCA) after you select parameters. Helpful for exploring past the simple analysis given on the FTDNA site. |
Chase Washley YDNA Groupings website app | An app that takes in the STR marker value charts of a project and provides analysis of basic grouping. |
David Vance's SAPP and SNPTree | Really tools for project managers to plot family group / surname branch results as trees (cladograms). Based on Ken Nordtvedt's intraclade estimation described here. |
McDonald Calculator | Estimates the TMRCA between two STR haplotypes |
Charles Warthen (and Wes Erickson) Tools | Southern California has been the birthplace and growth center of citizen-scientist genetic genealogy, from the pioneering work of the Southern California Genealogy Society in Burbank to similar groups such as the North San Diego County Genealogical Society that Charlie is involved with. Charlie, with the help of his former student Wes, has developed some interesting tools that help the individual and the surname study group push further. |
Felix's tools | Felix has been offline since 2015 and most of these tools no longer work. |
ySearch | Like GEDMatch but for yDNA STR values: a match service independent of the test company. It made a difference when multiple test companies provided STR testing, but all folded except FTDNA. It was being supported / run by FTDNA but closed when the EU GDPR regulations came into effect. |
- There is a nice explanation of the genetic distance calculation from Haplotype (i.e. STRs) at the sister Wright-DNA project.
- See the ISOGG Y-DNA Tools page for additional entries.
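As a rough illustration of the genetic distance calculation from STR haplotypes mentioned above, here is a minimal sketch using the simple stepwise model: the distance is the sum of the absolute differences in repeat counts over the markers both testers share. This is our simplification, not the method of any particular site; real calculators handle multi-copy markers and some special markers differently. The function name and the marker values are hypothetical.

```python
def genetic_distance(hap_a, hap_b):
    """Stepwise genetic distance between two STR haplotypes.

    Each haplotype is a dict of {marker name: repeat count}.
    Only markers present in both haplotypes are compared.
    Returns (distance, number of markers compared).
    """
    shared = sorted(set(hap_a) & set(hap_b))
    if not shared:
        raise ValueError("no markers in common")
    distance = sum(abs(hap_a[m] - hap_b[m]) for m in shared)
    return distance, len(shared)

# Made-up values, for illustration only.
tester_1 = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
tester_2 = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 9, "DYS385": 11}

print(genetic_distance(tester_1, tester_2))  # (3, 4)
```

Reporting the number of markers compared alongside the distance matters, because a distance of 3 over 4 markers means something very different from a distance of 3 over 111 markers.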
Mito DNA Analysis Tools
Site | Notes |
mtHap tool | for mtDNA analysis and haplogroup determination (web version) |
MitoTools | for mtDNA analysis and haplogroup determination (web version or download for major desktop platforms) |
MitoYDNA | A new, in-development tool to replace the ySearch and MitoSearch of old. Provides independent, third-party match analysis. See the description above for yDNA applicability. It is not clear how much benefit this will provide, given the move to full mtDNA sequencing and the active development tree at yFull for mtDNA. |
MitoSearch | Similar to ySearch mentioned above; a tool to upload mtDNA SNP VCF results and compare to others. Mostly focused on the HVR1/HVR2 region tests that were popularized by FTDNA and provided as a dozen or so derived SNP values. It went away, as a service, in May 2018 when the EU GDPR regulations took effect. It pre-dated most full mtDNA testing, whose results return dozens of SNPs. Being replaced by the non-profit startup MitoYDNA mentioned above. |
NGS Tools
We need to backport the work we are developing in the Dante Labs and Nebula Genomics Facebook Group on WGS tools and techniques into this Wiki. Until then, see the entry document Bioinformatics for Newbies. We now develop the WGSExtract tool along with others.
Site | Notes |
WGS Extract | A cross-platform tool for analyzing NGS result files and extracting the necessary content to feed genetic genealogy sites. It enables the replacement of CE and microarray testing with 30x WGS as the primary test method in all areas. |
HTSLib Tools | SAMTools, BAMTools, HTSFile, etc. for processing sequencing test result files. (See also Wikipedia, the GitHub repository, and the original 1KGenome Project paper.) Try out BAM Bio for a quick, online, in-browser version of basic samtools functions. |
GATK, Picard, et al | The Broad Institute (Harvard Medical School, MIT) suite of processing tools. A good complement and alternate method for many of the features in HTSLib above. |
BEDToolsv2 | BEDTools tool kit from Univ of Utah, et al. See also their in-browser tools BAM.iobio, Qual.iobio, and VCF.iobio (and others). |
FASTX-Toolkit | FASTX processing tools for FASTA/FASTQ files |
Felix's tools | Felix has been offline since 2015, but many of his tools are still relevant. Felix's site of tools, utilities, articles and blog experiences is rich across the whole DNA test spectrum, not just for the Y chromosome as the name implies. Specifically, see GGK for doing autosomal matching with multiple microarray file formats. There is also a 23andMe converter and a companion Google Chrome extension that turns your view of the online ISOGG haplogroup phylogenetic tree into a personalized report to help you understand and find your haplogroup, as well as Whit Athey's Haplogroup Predictor. Felix has stopped development and many of his tools no longer work; specifically, the Google Chrome extension no longer works with the ISOGG tree webpages. |
LobSTR | The first STR extractor from sequencing files. Basically, what yFull and now FTDNA are doing with BigY and similar result files. Note: this is advanced use. Maybe simply use yFull or FTDNA's new BigY processing. Other, more recent tools in the same vein are hipSTR, GangSTR, STRetch, and RepeatSeq. |
(See also Dave Tang's comments on how to set up a virtual machine to run the Linux tools in Windows; the approach is similar for other platforms. This is for those who do not work natively in Unix or Linux. One of our admins has been on Unix since 1974 so ...)
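The Cladefinder entry earlier notes that WGS VCF files must be subsetted to Y only because of their size. As a minimal pure-Python sketch of that step, the filter below keeps every header line plus the records whose chromosome column is the Y chromosome. The contig names `Y` and `chrY`, the function names, and the file names in the usage note are our assumptions. The HTSLib family of tools does this much faster on indexed files; this only shows the idea.

```python
import gzip

# Assumed Y contig names; reference genomes differ on the "chr" prefix.
Y_NAMES = {"Y", "chrY"}

def subset_y(lines):
    """Yield VCF header lines and the records located on the Y chromosome."""
    for line in lines:
        # Header lines start with '#'; data lines begin with the CHROM column.
        if line.startswith("#") or line.split("\t", 1)[0] in Y_NAMES:
            yield line

def write_y_subset(src_path, dst_path):
    """Stream a plain or gzip-compressed VCF into an uncompressed Y-only VCF."""
    opener = gzip.open if str(src_path).endswith(".gz") else open
    with opener(src_path, "rt") as src, open(dst_path, "w") as dst:
        dst.writelines(subset_y(src))
```

A call such as `write_y_subset("genome.vcf.gz", "genome_Y.vcf")` (hypothetical file names) streams the file line by line, so memory use stays small even for a full 30x WGS VCF.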
Notes and Other
- Project leaders have spreadsheets and other support material to extract your ISOGG nomenclature yDNA SNP results from your downloaded 23andMe or AncestryDNA test data and map them into other versions of the ISOGG yDNA haplogroup tree: either into the 2010 tree FTDNA used to use (so you can do apples-to-apples comparisons there) or into the latest tree available, to understand the latest terminal SNPs they have. ISOGG has fallen way behind FTDNA and other sites like yFull on haplogroup tree development. This Surname DNA project relies on Ancestry (autosomal only), FTDNA (yDNA only), yFull and yTree for analysis of results, and on WGS test companies where possible to avoid the high cost, piecemeal testing, and closed database of FTDNA. Report your SNP results from other test companies to the project admins and they will attach notes to your FTDNA account to help in placing you into the right subgroup (if you have only a basic STR test there). The 23andMe haplogroup tool of old is also good for both yDNA and mitochondrial analysis; the spreadsheets here are most helpful for the AncestryDNA yDNA results.
There are a number of cross-overs occurring now, to compete with Ancestry's rapidly growing dominance in the genetic genealogy space.
- MyHeritage, after earlier partnering, has developed their own branded autosomal test kit and allows the import of results from other sites like Ancestry. Once you tag a living relative with the test results, they help propagate it appropriately up a patriline or matriline, or to near-term relatives if autosomal. This allows you to find matching DNA relatives who may share tree fragments as well, something GEDMatch has not quite been able to achieve with separate DNA and GEDCOM uploads. This is being adopted into Geni, since MyHeritage purchased them, and into Legacy Family Tree, since purchased by MyHeritage also.
- WikiTree has incorporated DNA results into their one-world-tree concept, allowing you to look for potential matches by finding GEDMatch kit numbers of others within "matching" range. They have some helpful Wiki documentation on their supported process. It appears Geni, Legacy and others are quickly "following suit".
- ISOGG has many useful descriptions, tables and links to resources as well. Historically very biased towards a particular testing company but becoming less so each day. Somewhat of a spin out of the SCGS.
- Promethease, which is based on SNPedia. [Note: Promethease and SNPedia were purchased by MyHeritage over the summer of 2019.]